Jianke Zhu

VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration

Jan 30, 2026

Interp3D: Correspondence-aware Interpolation for Generative Textured 3D Morphing

Jan 20, 2026

Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems

Dec 30, 2025

Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future

Dec 18, 2025

RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning

Oct 02, 2025

MambaMap: Online Vectorized HD Map Construction using State Space Model

Jul 27, 2025

SAM4D: Segment Anything in Camera and LiDAR Streams

Jun 26, 2025

OmniAvatar: Efficient Audio-Driven Avatar Video Generation with Adaptive Body Animation

Jun 23, 2025

PixelThink: Towards Efficient Chain-of-Pixel Reasoning

May 29, 2025

Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps

May 24, 2025